-
-
-
- FLEX(1) USER COMMANDS FLEX(1)
-
-
-
- NAME
- flex - fast lexical analyzer generator
-
- SYNOPSIS
- flex [-bcdfhilnpstvwBFILTV78+ -C[aefFmr] -Pprefix -Sskeleton]
- [filename ...]
-
- DESCRIPTION
- flex is a tool for generating scanners: programs which
- recognize lexical patterns in text. flex reads the given
- input files, or its standard input if no file names are
- given, for a description of a scanner to generate. The
- description is in the form of pairs of regular expressions
- and C code, called rules. flex generates as output a C
- source file, lex.yy.c, which defines a routine yylex(). This
- file is compiled and linked with the -lfl library to produce
- an executable. When the executable is run, it analyzes its
- input for occurrences of the regular expressions. Whenever
- it finds one, it executes the corresponding C code.
-
- For full documentation, see flexdoc(1). This manual entry is
- intended for use as a quick reference.
-
- OPTIONS
- flex has the following options:
-
- -b generate backing-up information to lex.backup. This is
- a list of scanner states which require backing up and
- the input characters on which they do so. By adding
- rules one can remove backing-up states. If all
- backing-up states are eliminated and -Cf or -CF is
- used, the generated scanner will run faster.
-
- -c is a do-nothing, deprecated option included for POSIX
- compliance.
-
- NOTE: in previous releases of flex -c specified
- table-compression options. This functionality is now given
- by the -C flag. To ease the impact of this change,
- when flex encounters -c, it currently issues a warning
- message and assumes that -C was desired instead. In
- the future this "promotion" of -c to -C will go away in
- the name of full POSIX compliance (unless the POSIX
- meaning is removed first).
-
- -d makes the generated scanner run in debug mode.
- Whenever a pattern is recognized and the global
- yy_flex_debug is non-zero (which is the default), the
- scanner will write to stderr a line of the form:
-
- --accepting rule at line 53 ("the matched text")
-
-
-
-
- Version 2.4 Last change: November 1993 1
-
- The line number refers to the location of the rule in
- the file defining the scanner (i.e., the file that was
- fed to flex). Messages are also generated when the
- scanner backs up, accepts the default rule, reaches the
- end of its input buffer (or encounters a NUL; the two
- look the same as far as the scanner's concerned), or
- reaches an end-of-file.
-
- -f specifies fast scanner. No table compression is done
- and stdio is bypassed. The result is large but fast.
- This option is equivalent to -Cfr (see below).
-
- -h generates a "help" summary of flex's options to stderr
- and then exits.
-
- -i instructs flex to generate a case-insensitive scanner.
- The case of letters given in the flex input patterns
- will be ignored, and tokens in the input will be
- matched regardless of case. The matched text given in
- yytext will have the preserved case (i.e., it will not
- be folded).
-
- -l turns on maximum compatibility with the original AT&T
- lex implementation, at a considerable performance cost.
- This option is incompatible with -+, -f, -F, -Cf, or
- -CF. See flexdoc(1) for details.
-
- -n is another do-nothing, deprecated option included only
- for POSIX compliance.
-
- -p generates a performance report to stderr. The report
- consists of comments regarding features of the flex
- input file which will cause a loss of performance in
- the resulting scanner. If you give the flag twice, you
- will also get comments regarding features that lead to
- minor performance losses.
-
- -s causes the default rule (that unmatched scanner input
- is echoed to stdout) to be suppressed. If the scanner
- encounters input that does not match any of its rules,
- it aborts with an error.
-
- -t instructs flex to write the scanner it generates to
- standard output instead of lex.yy.c.
-
- -v specifies that flex should write to stderr a summary of
- statistics regarding the scanner it generates.
-
- -w suppresses warning messages.
-
- -B instructs flex to generate a batch scanner instead of
- an interactive scanner (see -I below). See flexdoc(1)
- for details. Scanners using -Cf or -CF compression
- options automatically specify this option, too.
-
- -F specifies that the fast scanner table representation
- should be used (and stdio bypassed). This representation
- is about as fast as the full table representation
- (-f), and for some sets of patterns will be considerably
- smaller (and for others, larger). It cannot be
- used with the -+ option. See flexdoc(1) for more
- details.
-
- This option is equivalent to -CFr (see below).
-
- -I instructs flex to generate an interactive scanner, that
- is, a scanner which stops immediately rather than looking
- ahead if it knows that the currently scanned text
- cannot be part of a longer rule's match. This is the
- opposite of batch scanners (see -B above). See
- flexdoc(1) for details.
-
- Note, -I cannot be used in conjunction with full or
- fast tables, i.e., the -f, -F, -Cf, or -CF flags. For
- other table compression options, -I is the default.
-
- -L instructs flex not to generate #line directives in
- lex.yy.c. The default is to generate such directives so
- error messages in the actions will be correctly located
- with respect to the original flex input file, and not
- to the fairly meaningless line numbers of lex.yy.c.
-
- -T makes flex run in trace mode. It will generate a lot
- of messages to stderr concerning the form of the input
- and the resultant non-deterministic and deterministic
- finite automata. This option is mostly for use in
- maintaining flex.
-
- -V prints the version number to stderr and exits.
-
- -7 instructs flex to generate a 7-bit scanner, which can
- save considerable table space, especially when using
- -Cf or -CF (and, at most sites, -7 is on by default for
- these options. To see if this is the case, use the -v
- verbose flag and check the flag summary it reports).
-
- -8 instructs flex to generate an 8-bit scanner. This is
- the default except for the -Cf and -CF compression
- options, for which the default is site-dependent, and
- can be checked by inspecting the flag summary generated
- by the -v option.
-
- -+ specifies that you want flex to generate a C++ scanner
- class. See the section on Generating C++ Scanners in
- flexdoc(1) for details.
-
- -C[aefFmr]
- controls the degree of table compression and scanner
- optimization.
-
- -Ca trades off larger tables in the generated scanner
- for faster performance because the elements of the
- tables are better aligned for memory access and
- computation. This option can double the size of the tables
- used by your scanner.
-
- -Ce directs flex to construct equivalence classes,
- i.e., sets of characters which have identical lexical
- properties. Equivalence classes usually give dramatic
- reductions in the final table/object file sizes
- (typically a factor of 2-5) and are pretty cheap
- performance-wise (one array look-up per character
- scanned).
-
- -Cf specifies that the full scanner tables should be
- generated - flex should not compress the tables by
- taking advantage of similar transition functions for
- different states.
-
- -CF specifies that the alternate fast scanner
- representation (described in flexdoc(1)) should be used. This
- option cannot be used with -+.
-
- -Cm directs flex to construct meta-equivalence classes,
- which are sets of equivalence classes (or characters,
- if equivalence classes are not being used) that are
- commonly used together. Meta-equivalence classes are
- often a big win when using compressed tables, but they
- have a moderate performance impact (one or two "if"
- tests and one array look-up per character scanned).
-
- -Cr causes the generated scanner to bypass using stdio
- for input. In general this option results in a minor
- performance gain only worthwhile if used in conjunction
- with -Cf or -CF. It can cause surprising behavior if
- you use stdio yourself to read from yyin prior to
- calling the scanner.
-
- A lone -C specifies that the scanner tables should be
- compressed but neither equivalence classes nor
- meta-equivalence classes should be used.
-
- The options -Cf or -CF and -Cm do not make sense
- together - there is no opportunity for meta-equivalence
- classes if the table is not being compressed.
- Otherwise the options may be freely mixed.
-
- The default setting is -Cem, which specifies that flex
- should generate equivalence classes and
- meta-equivalence classes. This setting provides the highest
- degree of table compression. You can trade larger
- tables for faster-executing scanners, with the
- following generally being true:
-
- slowest & smallest
- -Cem
- -Cm
- -Ce
- -C
- -C{f,F}e
- -C{f,F}
- -C{f,F}a
- fastest & largest
-
-
- -C options are cumulative.
-
- -Pprefix
- changes the default yy prefix used by flex to be prefix
- instead. See flexdoc(1) for a description of all the
- global variables and file names that this affects.
-
- -Sskeleton_file
- overrides the default skeleton file from which flex
- constructs its scanners. You'll never need this option
- unless you are doing flex maintenance or development.
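-
- The -Ce description above can be illustrated with a small
- hand-written sketch in C. It is not the table layout that flex
- actually generates; the names ec, next_state, and classify are
- hypothetical, and the character ranges assume an ASCII-compatible
- character set. Every character is first mapped to a class with one
- array look-up, so the transition table needs one column per class
- rather than one column per character:
-

```c
/* Equivalence classes: characters that behave identically share a class. */
enum { EC_OTHER, EC_LETTER, EC_DIGIT, EC_COUNT };
enum { S_START, S_WORD, S_NUMBER, S_DEAD, S_COUNT };

static unsigned char ec[256];   /* character -> equivalence class */
static int ec_ready;

/* 4 states x 3 classes instead of 4 states x 256 characters. */
static const unsigned char next_state[S_COUNT][EC_COUNT] = {
    /* other    letter   digit    */
    { S_DEAD,  S_WORD,  S_NUMBER },  /* S_START  */
    { S_DEAD,  S_WORD,  S_DEAD   },  /* S_WORD   */
    { S_DEAD,  S_DEAD,  S_NUMBER },  /* S_NUMBER */
    { S_DEAD,  S_DEAD,  S_DEAD   },  /* S_DEAD   */
};

static void build_classes(void)
{
    int c;
    for (c = 0; c < 256; c++)
        ec[c] = EC_OTHER;
    for (c = 'a'; c <= 'z'; c++)   /* assumes contiguous lowercase letters */
        ec[c] = EC_LETTER;
    for (c = '0'; c <= '9'; c++)
        ec[c] = EC_DIGIT;
    ec_ready = 1;
}

/* Return 1 if text matches [a-z]+, 2 if it matches [0-9]+, else 0. */
int classify(const char *text)
{
    int state = S_START;
    if (!ec_ready)
        build_classes();
    for (; *text != '\0'; text++)
        state = next_state[state][ec[(unsigned char) *text]];
    return state == S_WORD ? 1 : state == S_NUMBER ? 2 : 0;
}
```

- The extra ec[] look-up per character is the cost the -Ce entry
- refers to; the saving is the narrower transition table.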
-
- SUMMARY OF FLEX REGULAR EXPRESSIONS
- The patterns in the input are written using an extended set
- of regular expressions. These are:
-
- x          match the character 'x'
- .          any character except newline
- [xyz]      a "character class"; in this case, the pattern
-            matches either an 'x', a 'y', or a 'z'
- [abj-oZ]   a "character class" with a range in it; matches
-            an 'a', a 'b', any letter from 'j' through 'o',
-            or a 'Z'
- [^A-Z]     a "negated character class", i.e., any character
-            but those in the class. In this case, any
-            character EXCEPT an uppercase letter.
- [^A-Z\n]   any character EXCEPT an uppercase letter or
-            a newline
- r*         zero or more r's, where r is any regular expression
- r+         one or more r's
- r?         zero or one r's (that is, "an optional r")
- r{2,5}     anywhere from two to five r's
- r{2,}      two or more r's
- r{4}       exactly 4 r's
- {name}     the expansion of the "name" definition
-            (see above)
- "[xyz]\"foo"
-            the literal string: [xyz]"foo
- \X         if X is an 'a', 'b', 'f', 'n', 'r', 't', or 'v',
-            then the ANSI-C interpretation of \x.
-            Otherwise, a literal 'X' (used to escape
-            operators such as '*')
- \123       the character with octal value 123
- \x2a       the character with hexadecimal value 2a
- (r)        match an r; parentheses are used to override
-            precedence (see below)
-
- rs         the regular expression r followed by the
-            regular expression s; called "concatenation"
-
- r|s        either an r or an s
-
- r/s        an r but only if it is followed by an s. The
-            s is not part of the matched text. This type
-            of pattern is called "trailing context".
- ^r         an r, but only at the beginning of a line
- r$         an r, but only at the end of a line. Equivalent
-            to "r/\n".
-
- <s>r       an r, but only in start condition s (see
-            below for discussion of start conditions)
- <s1,s2,s3>r
-            same, but in any of start conditions s1,
-            s2, or s3
- <*>r       an r in any start condition, even an exclusive one.
-
- <<EOF>>    an end-of-file
- <s1,s2><<EOF>>
-            an end-of-file when in start condition s1 or s2
- The regular expressions listed above are grouped according
- to precedence, from highest precedence at the top to lowest
- at the bottom. Those grouped together have equal
- precedence.
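-
- The numeric escapes in the table follow the C rules for
- character constants, so their values can be checked with a few
- lines of C. The function names below are hypothetical, and the
- remark that the two values are the characters 'S' and '*' assumes
- an ASCII character set:
-

```c
/* Each function returns the numeric value of one C character escape. */

/* Octal 123 is 1*64 + 2*8 + 3 = 83, which is 'S' in ASCII. */
int octal_escape(void)
{
    return '\123';
}

/* Hexadecimal 2a is 2*16 + 10 = 42, which is '*' in ASCII. */
int hex_escape(void)
{
    return '\x2a';
}
```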
-
- Some notes on patterns:
-
- - Negated character classes match newlines unless "\n"
- (or an equivalent escape sequence) is one of the
- characters explicitly present in the negated character
- class (e.g., "[^A-Z\n]").
-
- - A rule can have at most one instance of trailing
- context (the '/' operator or the '$' operator). The start
- condition, '^', and "<<EOF>>" patterns can only occur
- at the beginning of a pattern, and, like '/' and '$',
- cannot be grouped inside parentheses. The
- following are all illegal:
-
- foo/bar$
- foo|(bar$)
- foo|^bar
- <sc1>foo<sc2>bar
-
-
- SUMMARY OF SPECIAL ACTIONS
- In addition to arbitrary C code, the following can appear in
- actions:
-
- - ECHO copies yytext to the scanner's output.
-
- - BEGIN followed by the name of a start condition places
- the scanner in the corresponding start condition.
-
- - REJECT directs the scanner to proceed on to the "second
- best" rule which matched the input (or a prefix of the
- input). yytext and yyleng are set up appropriately.
- Note that REJECT is a particularly expensive feature in
- terms of scanner performance; if it is used in any of the
- scanner's actions it will slow down all of the
- scanner's matching. Furthermore, REJECT cannot be used
- with the -f or -F options.
-
- Note also that unlike the other special actions, REJECT
- is a branch; code immediately following it in the
- action will not be executed.
-
- - yymore() tells the scanner that the next time it
- matches a rule, the corresponding token should be
- appended onto the current value of yytext rather than
- replacing it.
-
- - yyless(n) returns all but the first n characters of the
- current token back to the input stream, where they will
- be rescanned when the scanner looks for the next match.
- yytext and yyleng are adjusted appropriately (e.g.,
- yyleng will now be equal to n).
-
- - unput(c) puts the character c back onto the input
- stream. It will be the next character scanned.
-
- - input() reads the next character from the input stream
- (this routine is called yyinput() if the scanner is
- compiled using C++).
-
- - yyterminate() can be used in lieu of a return statement
- in an action. It terminates the scanner and returns a
- 0 to the scanner's caller, indicating "all done".
-
- By default, yyterminate() is also called when an
- end-of-file is encountered. It is a macro and may be
- redefined.
-
- - YY_NEW_FILE is an action available only in <<EOF>>
- rules. It means "Okay, I've set up a new input file,
- continue scanning". It is no longer required; you can
- just assign yyin to point to a new file in the <<EOF>>
- action.
-
- - yy_create_buffer( file, size ) takes a FILE pointer and
- an integer size. It returns a YY_BUFFER_STATE handle to
- a new input buffer large enough to accommodate size
- characters and associated with the given file. When in
- doubt, use YY_BUF_SIZE for the size.
-
- - yy_switch_to_buffer( new_buffer ) switches the
- scanner's processing to scan for tokens from the given
- buffer, which must be a YY_BUFFER_STATE.
-
- - yy_delete_buffer( buffer ) deletes the given buffer.
-
- VALUES AVAILABLE TO THE USER
- - char *yytext holds the text of the current token. It
- may be modified but not lengthened (you cannot append
- characters to the end). Modifying the last character
- may affect the activity of rules anchored using '^'
- during the next scan; see flexdoc(1) for details.
-
- If the special directive %array appears in the first
- section of the scanner description, then yytext is
- instead declared char yytext[YYLMAX], where YYLMAX is a
- macro definition that you can redefine in the first
- section if you don't like the default value (generally
- 8KB). Using %array results in somewhat slower
- scanners, but the value of yytext becomes immune to
- calls to input() and unput(), which potentially destroy
- its value when yytext is a character pointer. The
- opposite of %array is %pointer, which is the default.
-
- You cannot use %array when generating C++ scanner
- classes (the -+ flag).
-
- - int yyleng holds the length of the current token.
-
- - FILE *yyin is the file which by default flex reads
- from. It may be redefined but doing so only makes
- sense before scanning begins or after an EOF has been
- encountered. Changing it in the midst of scanning will
- have unexpected results since flex buffers its input;
- use yyrestart() instead. Once scanning terminates
- because an end-of-file has been seen, you can point
- yyin at the new input file and then call the scanner
- again to continue scanning.
-
- - void yyrestart( FILE *new_file ) may be called to point
- yyin at the new input file. The switch-over to the new
- file is immediate (any previously buffered-up input is
- lost). Note that calling yyrestart() with yyin as an
- argument thus throws away the current input buffer and
- continues scanning the same input file.
-
- - FILE *yyout is the file to which ECHO actions are done.
- It can be reassigned by the user.
-
- - YY_CURRENT_BUFFER returns a YY_BUFFER_STATE handle to
- the current buffer.
-
- - YY_START returns an integer value corresponding to the
- current start condition. You can subsequently use this
- value with BEGIN to return to that start condition.
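-
- Because input() and unput() can destroy yytext when it is a
- character pointer, an action that needs the token text afterwards
- has to copy it first. A minimal sketch of such a copy follows;
- save_token is a hypothetical helper name, not part of flex, and it
- assumes that len is not negative (as is the case for yyleng):
-

```c
#include <stdlib.h>
#include <string.h>

/* Return a NUL-terminated heap copy of the first len bytes of text,
 * or NULL if memory is exhausted. The caller frees the result. */
char *save_token(const char *text, int len)
{
    char *copy = malloc((size_t) len + 1);
    if (copy == NULL)
        return NULL;
    memcpy(copy, text, (size_t) len);
    copy[len] = '\0';
    return copy;
}
```

- In an action this would be called as save_token( yytext, yyleng )
- before the first call to input() or unput().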
-
- MACROS AND FUNCTIONS YOU CAN REDEFINE
- - YY_DECL controls how the scanning routine is declared.
- By default, it is "int yylex()", or, if prototypes are
- being used, "int yylex(void)". This definition may be
- changed by redefining the "YY_DECL" macro. Note that
- if you give arguments to the scanning routine using a
- K&R-style/non-prototyped function declaration, you must
- terminate the definition with a semi-colon (;).
-
- - The nature of how the scanner gets its input can be
- controlled by redefining the YY_INPUT macro.
- YY_INPUT's calling sequence is
- "YY_INPUT(buf,result,max_size)". Its action is to
- place up to max_size characters in the character array
- buf and return in the integer variable result either
- the number of characters read or the constant YY_NULL
- (0 on Unix systems) to indicate EOF. The default
- YY_INPUT reads from the global file-pointer "yyin". A
- sample redefinition of YY_INPUT (in the definitions
- section of the input file):
-
- %{
- #undef YY_INPUT
- #define YY_INPUT(buf,result,max_size) \
-     { \
-     int c = getchar(); \
-     result = (c == EOF) ? YY_NULL : (buf[0] = c, 1); \
-     }
- %}
-
-
- - When the scanner receives an end-of-file indication
- from YY_INPUT, it then checks the yywrap()
- function. If yywrap() returns false (zero), then it is
- assumed that the function has gone ahead and set up
- yyin to point to another input file, and scanning
- continues. If it returns true (non-zero), then the
- scanner terminates, returning 0 to its caller.
-
- The default yywrap() always returns 1.
-
- - YY_USER_ACTION can be redefined to provide an action
- which is always executed prior to the matched rule's
- action.
-
- - The macro YY_USER_INIT may be redefined to provide an
- action which is always executed before the first scan.
-
- - In the generated scanner, the actions are all gathered
- in one large switch statement and separated using
- YY_BREAK, which may be redefined. By default, it is
- simply a "break", to separate each rule's action from
- the following rule's.
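-
- As an illustration of the yywrap() contract described above, the
- following sketch steps through a list of input files. The names
- pending_files and set_pending_files are hypothetical, and yyin is
- defined here only so that the fragment compiles on its own; in a
- real scanner the generated file supplies yyin:
-

```c
#include <stdio.h>

/* The generated scanner supplies yyin; it is defined here only so
 * the sketch compiles alone. */
FILE *yyin;

/* Hypothetical NULL-terminated list of files still to be scanned. */
static const char **pending_files;

void set_pending_files(const char **files)
{
    pending_files = files;
}

/* Return 0 after pointing yyin at the next readable file,
 * or 1 when no input is left. Unreadable files are skipped. */
int yywrap(void)
{
    while (pending_files != NULL && *pending_files != NULL) {
        FILE *next = fopen(*pending_files++, "r");
        if (next != NULL) {
            if (yyin != NULL && yyin != stdin)
                fclose(yyin);
            yyin = next;
            return 0;
        }
    }
    return 1;
}
```

- A main() routine would typically call set_pending_files() with
- the remaining command-line arguments before the first call to
- yylex().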
-
- FILES
- -lfl library with which to link scanners to obtain the
- default versions of yywrap() and/or main().
-
- lex.yy.c
- generated scanner (called lexyy.c on some systems).
-
- lex.yy.cc
- generated C++ scanner class, when using -+.
-
- <FlexLexer.h>
- header file defining the C++ scanner base class,
- FlexLexer, and its derived class, yyFlexLexer.
-
- flex.skl
- skeleton scanner. This file is only used when building
- flex, not when flex executes.
-
- lex.backup
- backing-up information for -b flag (called lex.bck on
- some systems).
-
- SEE ALSO
- flexdoc(1), lex(1), yacc(1), sed(1), awk(1).
-
- M. E. Lesk and E. Schmidt, LEX - Lexical Analyzer Generator
-
- DIAGNOSTICS
- reject_used_but_not_detected undefined or
-
- yymore_used_but_not_detected undefined - These errors can
- occur at compile time. They indicate that the scanner uses
- REJECT or yymore() but that flex failed to notice the fact,
- meaning that flex scanned the first two sections looking for
- occurrences of these actions and failed to find any, but
- somehow you snuck some in (via a #include file, for
- example). Make an explicit reference to the action in your flex
- input file. (Note that previously flex supported a
- %used/%unused mechanism for dealing with this problem; this
- feature is still supported but now deprecated, and will go
- away soon unless the author hears from people who can argue
- compellingly that they need it.)
-
- flex scanner jammed - a scanner compiled with -s has
- encountered an input string which wasn't matched by any of its
- rules.
-
- warning, rule cannot be matched indicates that the given
- rule cannot be matched because it follows other rules that
- will always match the same text as it. See flexdoc(1) for
- an example.
-
- warning, -s option given but default rule can be matched
- means that it is possible (perhaps only in a particular
- start condition) that the default rule (match any single
- character) is the only one that will match a particular
- input. Since -s was given, presumably this is not intended.
-
- scanner input buffer overflowed - a scanner rule matched
- more text than the available dynamic memory.
-
- token too large, exceeds YYLMAX - your scanner uses %array
- and one of its rules matched a string longer than the YYLMAX
- constant (8K bytes by default). You can increase the value
- by #define'ing YYLMAX in the definitions section of your
- flex input.
-
- scanner requires -8 flag to use the character 'x' - Your
- scanner specification includes recognizing the 8-bit
- character 'x' and you did not specify the -8 flag, and your
- scanner defaulted to 7-bit because you used the -Cf or -CF
- table compression options.
-
- flex scanner push-back overflow - you used unput() to push
- back so much text that the scanner's buffer could not hold
- both the pushed-back text and the current token in yytext.
- Ideally the scanner should dynamically resize the buffer in
- this case, but at present it does not.
-
- input buffer overflow, can't enlarge buffer because scanner
- uses REJECT - the scanner was working on matching an
- extremely large token and needed to expand the input buffer.
- This doesn't work with scanners that use REJECT.
-
- fatal flex scanner internal error--end of buffer missed -
- This can occur in a scanner which is reentered after a
- long-jump has jumped out (or over) the scanner's activation
- frame. Before reentering the scanner, use:
-
- yyrestart( yyin );
-
- or use C++ scanner classes (the -+ option), which are fully
- reentrant.
-
- AUTHOR
- Vern Paxson, with the help of many ideas and much
- inspiration from Van Jacobson. Original version by Jef Poskanzer.
-
- See flexdoc(1) for additional credits and the address to
- send comments to.
-
- DEFICIENCIES / BUGS
- Some trailing context patterns cannot be properly matched
- and generate warning messages ("dangerous trailing
- context"). These are patterns where the ending of the first
- part of the rule matches the beginning of the second part,
- such as "zx*/xy*", where the 'x*' matches the 'x' at the
- beginning of the trailing context. (Note that the POSIX
- draft states that the text matched by such patterns is
- undefined.)
-
- For some trailing context rules, parts which are actually
- fixed-length are not recognized as such, leading to the
- abovementioned performance loss. In particular, parts using
- '|' or {n} (such as "foo{3}") are always considered
- variable-length.
-
- Combining trailing context with the special '|' action can
- result in fixed trailing context being turned into the more
- expensive variable trailing context. For example, in the
- following:
-
- %%
- abc |
- xyz/def
-
-
- Use of unput() or input() invalidates yytext and yyleng,
- unless the %array directive or the -l option has been used.
-
- Use of unput() to push back more text than was matched can
- result in the pushed-back text matching a beginning-of-line
- ('^') rule even though it didn't come at the beginning of
- the line (though this is rare!).
-
- Pattern-matching of NUL's is substantially slower than
- matching other characters.
-
- Dynamic resizing of the input buffer is slow, as it entails
- rescanning all the text matched so far by the current
- (generally huge) token.
-
- flex does not generate correct #line directives for code
- internal to the scanner; thus, bugs in flex.skl yield bogus
- line numbers.
-
- Due to both buffering of input and read-ahead, you cannot
- intermix calls to <stdio.h> routines, such as, for example,
- getchar(), with flex rules and expect it to work. Call
- input() instead.
-
- The total number of table entries listed by the -v flag
- excludes the table entries needed to determine what rule has
- been matched. The number of such entries is equal to the number
- of DFA states if the scanner does not use REJECT, and
- somewhat greater than the number of states if it does.
-
- REJECT cannot be used with the -f or -F options.
-
- The flex internal algorithms need documentation.